The Open University ’ s repository of research publications and other research outputs Aggregating Research Papers from Publishers ’ Systems to Support Text and Data Mining : Deliberate Lack of Interoperability or Not ?

نویسندگان

  • Petr Knoth
  • Nancy Pontika
چکیده

In the current technology dominated world, interoperability of systems managed by different organisations is an essential property enabling the provision of services at a global scale. In the Text and Data Mining field (TDM), interoperability of systems offering access to text corpora offers the opportunity of increasing the uptake and impact of TDM applications. The global corpus of all research papers, i.e. the collection of human knowledge so large no one can ever read in their lifetime, represents one of the most exciting opportunities for TDM. Although the Open Access movement, which has been advocating for free availability and reuse rights to TDM from research papers, has achieved some major successes on the legal front, the technical interoperability of systems offering free access to research papers continues to be a challenge. COnnecting REpositories (CORE) (Knoth and Zdrahal, 2012) aggregates the world’s open access full-text scientific manuscripts from repositories, journals and publisher systems. One of the main goals of CORE is to harmonise and pre-process these data to lower the barrier for TDM. In this paper, we report on the preliminary results of an interoperability survey of systems provided by journal publishers, both open access and toll access. This helps us to assess the current level of systems’ interoperability and suggest ways forward.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?

In the current technology dominated world, interoperability of systems managed by different organisations is an essential property enabling the provision of services at a global scale. In the Text and Data Mining field (TDM), interoperability of systems offering access to text corpora offers the opportunity of increasing the uptake and impact of TDM applications. The global corpus of all resear...

متن کامل

Cross - Platform Text Mining and Natural Language Processing Interoperability PROCEEDINGS

In the current technology dominated world, interoperability of systems managed by different organisations is an essential property enabling the provision of services at a global scale. In the Text and Data Mining field (TDM), interoperability of systems offering access to text corpora offers the opportunity of increasing the uptake and impact of TDM applications. The global corpus of all resear...

متن کامل

The Open University ’ s repository of research publications and other research outputs How do some concepts vanish over time ?

This paper presents the current stage of my PhD research focused on the use of machine learning in supporting the human learn from examples. I present here an approach to answer how some concepts change their contexts in time, using two techniques suitable for indexing and data mining: latent semantic indexing (LSI) and the APRIORI algorithm.

متن کامل

The Open University ’ s repository of research publications and other research outputs Understanding research dynamics

Rexplore leverages novel solutions in data mining, semantic technologies and visual analytics, and provides an innovative environment for exploring and making sense of scholarly data. Rexplore allows users: 1) to detect and make sense of important trends in research; 2) to identify a variety of interesting relations between researchers, beyond the standard co-authorship relations provided by mo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016